Automatic lexical stress and pitch accent detection for L2 English speech using multi-distribution deep neural networks

Authors

  • Kun Li
  • Shaoguang Mao
  • Xu Li
  • Zhiyong Wu
  • Helen M. Meng
Abstract

This paper investigates the use of multi-distribution deep neural networks (MD-DNNs) for automatic lexical stress detection and pitch accent detection, which are useful for suprasegmental mispronunciation detection and diagnosis in second-language (L2) English speech. The features used in this paper cover syllable-based prosodic features (including maximum syllable loudness, syllable nucleus duration and a pair of dynamic pitch values) as well as lexical and syntactic features (encoded as binary variables). As stressed/accented syllables are more prominent than their neighbors, the two preceding and two following syllables are also taken into consideration. Experimental results show that the MD-DNN for lexical stress detection achieves an accuracy of 87.9% in syllable classification (primary/secondary/no stress) for words with three or more syllables. This is substantially better than our previous work using Gaussian mixture models (GMMs) and the prominence model (PM), which achieved accuracies of 72.1% and 76.3%, respectively. Built in the same way as the lexical stress detector, the pitch accent detector obtains an accuracy of 90.2%, which is about 9.6% and 6.9% higher than the results obtained with the GMMs and the PM, respectively.
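The context window described in the abstract (a syllable plus its two preceding and two following neighbors) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `context_window`, the zero-padding at sequence boundaries, the feature ordering, and all feature values below are assumptions made purely for illustration.

```python
from typing import List, Sequence


def context_window(syllables: Sequence[Sequence[float]], index: int,
                   left: int = 2, right: int = 2) -> List[float]:
    """Concatenate the feature vectors of a syllable and its neighbors.

    Positions that fall outside the sequence are zero-padded. This padding
    is an assumption; the abstract does not say how boundaries are handled.
    """
    dim = len(syllables[0])
    window: List[float] = []
    for pos in range(index - left, index + right + 1):
        if 0 <= pos < len(syllables):
            window.extend(syllables[pos])
        else:
            window.extend([0.0] * dim)
    return window


# Hypothetical per-syllable features (values invented for illustration):
# [max loudness, nucleus duration, pitch value 1, pitch value 2, binary lexical flag]
syllables = [
    [0.2, 0.08, 0.1, 0.3, 0.0],
    [0.9, 0.15, 0.4, 0.8, 1.0],
    [0.3, 0.07, 0.2, 0.1, 0.0],
]

vector = context_window(syllables, index=1)
print(len(vector))  # 5 positions x 5 features = 25
```

In this sketch the real-valued prosodic features and the binary flag are simply concatenated into one input vector; how the MD-DNN treats the two kinds of input differently is not covered here.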

Similar resources

Automatic Classification of Lexical Stress in English and Arabic Languages Using Deep Learning

Prosodic features are important for the intelligibility and proficiency of stress-timed languages such as English and Arabic. Producing the appropriate lexical stress is challenging for second language (L2) learners, in particular, those whose first language (L1) is a syllable-timed language such as Spanish, French, etc. In this paper we introduce a method for automatic classification of lexica...

Lexical stress detection for L2 English speech using deep belief networks

This paper investigates lexical stress detection for L2 English speech using Deep Belief Networks (DBNs). The features of the DBN used in this work include the syllable-based prosodic features (assumed to have Gaussian distribution) and their expected lexical stress (assumed to have Bernoulli distribution). As stressed syllables are more prominent than their neighbors, the two preceding and two...

From English pitch accent detection to Mandarin stress detection, where is the difference?

Although English pitch accent detection has been studied extensively, relatively few works explore Mandarin stress detection. Moreover, Mandarin stress detection and English pitch accent detection have not been compared and analyzed as counterpart tasks. In this paper, we discuss Mandarin stress detection and compare it with English pitch accent detection. The c...

Is Acquisition of L2 Phonemes Difficult? Production of English Stress by Japanese Speakers

This study examined the production of English lexical stress by Japanese speakers to determine which acoustic features associated with English lexical stress are difficult for Japanese speakers to acquire. Realization of lexical accent differs between languages. English is a stress-accent language where the accent is expressed by a combination of pitch, duration, intensity and vowel quality (Fr...

Integrating acoustic and state-transition models for free phone recognition in L2 English speech using multi-distribution deep neural networks

This paper investigates the use of Multi-Distribution Deep Neural Networks (MD-DNNs) for integrating acoustic and state-transition models in free phone recognition of L2 English speech. In a Computer-Aided Pronunciation Training (CAPT) system, free phone recognition for L2 English speech is the key model of Mispronunciation Detection and Diagnosis (MDD) when learners are allowed to speak freely. A ...

Journal:
  • Speech Communication

Volume 96

Publication date: 2018